Learning CPG-based Biped Locomotion with a Policy Gradient Method: Application to a Humanoid Robot

Authors

Gen Endo

Jun Morimoto

Takamitsu Matsubara

Jun Nakanishi

Gordon Cheng

Abstract

In this paper we describe a learning framework for a central pattern generator (CPG)-based biped locomotion controller using a policy gradient method. Our goals in this study are to achieve CPG-based biped walking with a 3D hardware humanoid and to develop an efficient learning algorithm with CPG by reducing the dimensionality of the state space used for learning. We demonstrate that an appropriate feedback controller can be acquired within a few thousand trials by numerical simulations, and the controller obtained in numerical simulation achieves stable walking with a physical robot in the real world. Numerical simulations and hardware experiments evaluate the walking velocity and stability. The results suggest that the learning algorithm is capable of adapting to environmental changes. Furthermore, we present an online learning scheme with an initial policy for a hardware robot to improve the controller within 200 iter-

The International Journal of Robotics Research, Vol. 27, No. 2, February 2008, pp. 213–228. DOI: 10.1177/0278364907084980. © SAGE Publications 2008, Los Angeles, London, New Delhi and Singapore.

Similar resources

Learning CPG Sensory Feedback with Policy Gradient for Biped Locomotion for a Full-Body Humanoid

This paper describes a learning framework for a central pattern generator based biped locomotion controller using a policy gradient method. Our goals in this study are to achieve biped walking with a 3D hardware humanoid, and to develop an efficient learning algorithm with CPG by reducing the dimensionality of the state space used for learning. We demonstrate that an appropriate feedback contro...

Reinforcement Learning for a CPG-driven Biped Robot

Animals' rhythmic movements such as locomotion are considered to be controlled by neural circuits called central pattern generators (CPGs). This article presents a reinforcement learning (RL) method for a CPG controller, which is inspired by the control mechanism of animals. Because the CPG controller is an instance of recurrent neural networks, a naive application of RL involves difficulties. ...

Reinforcement Learning for CPG-Driven Biped Robot

Optimized Joint Trajectory Model with Customized Genetic Algorithm for Biped Robot Walk

Biped robot locomotion is one of the active research areas in robotics. In this area, real-time stable walking at a proper speed is one of the main challenges that needs to be overcome. Central Pattern Generators (CPGs), as one of the biological gait generation models, can produce complex nonlinear oscillations as a pattern for walking. In this paper, we propose a model for a biped robot joint tra...

Dynamic Control Algorithm for Biped Walking Based on Policy Gradient Fuzzy Reinforcement Learning

This paper presents a novel dynamic control approach to acquire biped walking of humanoid robots, focussed on policy gradient reinforcement learning with fuzzy evaluative feedback. The proposed structure of the controller involves two feedback loops: a conventional computed torque controller including an impact-force controller, and a reinforcement learning computed torque controller. Reinforcement learnin...

Journal title:

I. J. Robotics Res.

Volume 27, Issue

Pages -

Publication date: 2008
